Introduction


Research on school leadership shows that principals can significantly impact student
achievement by influencing classroom instruction, organizational conditions, and community
support and by setting the teaching and learning conditions in schools. 1 Moreover, strong principals
provide a multiplier effect that enables improvement initiatives to succeed. 2 

Yet each year, as many as 22% of current principals retire or leave their schools or the profession, 3 
requiring districts to either promote or hire new principals to fill vacancies at considerable district 
cost. 4 For example, previous research showed the following: 

• Principals in rural and city schools were less apt to stay than those in suburban schools. 

• More high school principals have left than middle or elementary school principals. 

• The majority of principals have left between their third and fifth year in the position. 5 

The new generation of principals is younger and has less teaching experience. Today's principal 
workforce is more mobile, works more hours, and experiences more job stress. New principals 6 
report being underprepared to evaluate teaching, provide teachers meaningful feedback, manage 
conflict, and balance tasks. 7 

Because of these workforce changes, understanding how to better prepare new leaders for the 
role of principal is an urgent policy concern. Although there may be little disagreement that good 
principals make a difference, what is less clear is how to systematically prepare and keep good 
principals. 


1 See, for example, Clifford, Behrstock-Sherratt, and Fetters (2012); Hallinger & Heck (1998); Leithwood, 
Louis, Anderson, and Wahlstrom (2004); and Marzano, Waters, and McNulty (2005). 

2 Manna (2015). 

3 Principal turnover varies geographically; see Goldring and Taie (2014). 

4 School Leaders Network (2014). 

5 Goldring and Taie (2014). 

6 Throughout this text, we use the term new principal to mean a principal who is new to his or her position 
as a principal. The term new is used here to mean a principal who lacks professional experience as a 
certified principal, with responsibilities for overseeing an entire school. A principal who is experienced as 
a principal is not considered new because he or she opts to take a position in another school. 

7 Clifford (2012). 
Evaluation of Principal Preparation Programs

Across the United States, as many as 700 principal preparation programs are preparing and 
certifying principals to lead our nation's schools. Most states require principals to complete a 
preparation program to obtain an administrative certification, although the criteria for the design 
of preparation programs vary. 8 

The methods that preparation programs use to train principals vary nationally and are a source 
of concern among policymakers, university faculty, and educators. 9 Some programs are 
developing innovative approaches to principal preparation, and with recent legislation, 
preparation programs have a strong interest in determining whether new preparation programs 
make a difference in improving student learning. At the same time, states are showing more and 
more interest in evaluation methods that may show evidence of quality in principal preparation 
programs. 10 

Background: Principal Preparation Evaluation Study 

During the course of the last 2 years, the George W. Bush Institute, in partnership with the 
American Institutes for Research (AIR), has been evaluating the impact of five principal 
preparation programs in the United States on student outcomes. The methodology used in this 
study may be useful for preparation programs and others to examine effects on student outcomes 
and consider how to increase the effectiveness of the work of principals. 

To date, only a few principal preparation programs have been formally evaluated using student 
outcomes. 11 Our goal in taking on this work was to try to extend the research base by evaluating
additional programs and tackling some of the challenges involved. 

The five preparation programs included in the evaluation study were selected based on a set of
criteria developed to reflect the best available theory and research on promising practices in
principal preparation. 12 They included a mix of independent nonprofit and university-based
programs.


8 See Anderson and Reynolds (2015). The following states do not require the completion of a principal 
preparation program: Hawaii, Montana, New Hampshire, Ohio, Pennsylvania, South Dakota, Texas, and 
Vermont. 

9 Anderson and Reynolds (2015) and Levine (2009). 

10 University Council for Educational Administration (UCEA) and New Leaders (2016). 

11 For example, see RAND's evaluation of the New Leaders Program (Gates et al., 2014) and the Institute 
of Education and Social Policy's evaluation of the New York City Leadership Academy (Corcoran, 
Schwartz, & Weinstein, 2009). 

12 Cheney, Davis, Garrett, & Holleran (2010); Darling-Hammond, LaPointe, Meyerson, Orr, and Cohen 
(2007); George W. Bush Institute (2013); and Shelton (2012). 
The key findings from this study are as follows:

• We found little evidence that student achievement was any different in schools led by 
graduates of the programs being evaluated. 13

• High-quality data were rarely available. We had difficulty getting data on principals, their 
assignments, and their experiences. Also, we could not get reliable data on other 
outcomes, such as school climate. 14

• Program graduates had generally positive perceptions of program coursework and 
hands-on experiences, but they had mixed perceptions of district supports and ongoing 
supports from their programs. 

• Although the overall outcomes were not better (or worse) for the graduates of these 
programs, there were high-performing and low-performing graduates from each 
program. 

Taken together, these findings suggest that focusing on reducing variation in the performance 
of graduates through training, selection, or other means and systematizing or better tailoring 
supports may be the keys to success in preparing effective school leaders. 

Challenges With Evaluating Principal Preparation Programs 

The research team faced several decision points posed by various challenges in assessing the 
impact of principal preparation programs on student learning. These challenges are described in 
this section. The challenges described are pertinent to states, districts, and universities conducting 
similar studies for program accountability or continuous improvement. 

Others have written about challenges in evaluating principal preparation programs and, 
specifically, in using student outcome data for evaluation. 15 Some of these challenges are similar
to those raised for evaluating teacher preparation programs using student outcomes, 16 such as
the difficulty of disentangling selection into the preparation program from the outcomes. Others 
are more specific to principal preparation, such as the fact that many individuals completing 
principal preparation programs do not immediately become principals or that principals affect 
achievement less directly than classroom teachers. 


13 Data limitations in one of the five districts impeded us from doing the full analyses. We focus here on 
findings from four of the five originally selected programs. 

14 For example, one district was unable to provide information on principal experience in or outside the 
district. 

15 See Burkhauser, Pierson, Gates, and Hamilton (2012); Fuller and Hollingworth (2014); and Grubb, Liao, 
and Cheung (2014). 

16 For example, see Baker et al. (2010); Darling-Hammond et al. (2013); and Kennedy (2010). 
These challenges do not diminish the importance of tracking program completers
after they graduate and gathering and rigorously analyzing data about their 
placement, retention, and school and student outcomes, including achievement. They 
do, however, point to the difficulty of comparing across schools and the need to 
consider a variety of factors in determining the overall effectiveness of any single 
preparation program. 


Challenge: The lack of reliable and consistent data on outcomes other than 
achievement can limit analysis. 

In addition to looking at student achievement as an outcome, additional proximal measures of 
principal effectiveness should be considered for evaluation. Examples of these are school climate, 
teacher retention, and principal and teacher effectiveness. Evidence of progress on these 
outcomes may occur sooner than effects on student achievement, and they can help reveal a more 
complete picture of what is happening in a school. 

For example, examining teacher retention in and of itself would not be sufficient because 
removing ineffective teachers from the workforce is one lever by which new principals can 
improve school outcomes. Instead, an examination of the extent to which principals retain 
effective teachers is more accurate. And although many states and districts have developed and 
implemented more rigorous teacher evaluation measures, these systems are still in their early 
stages and generally do not show variation in effectiveness, making them less useful for 
evaluation purposes. 
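
The distinction between overall teacher retention and retention of effective teachers can be illustrated with a minimal Python sketch. The data structure, the function name `retention_rates`, and the boolean `effective` flag are hypothetical and were not part of the study; the sketch also assumes that a valid effectiveness rating is available, which, as noted above, is often not the case.

```python
def retention_rates(teachers):
    """Return overall retention and retention among teachers rated effective.

    `teachers` is a list of dicts with two boolean keys (hypothetical schema):
    'effective' (rated effective) and 'retained' (stayed at the school).
    """
    def rate(group):
        # Share of the group that was retained; None if the group is empty.
        return sum(t["retained"] for t in group) / len(group) if group else None

    effective = [t for t in teachers if t["effective"]]
    return {"overall": rate(teachers), "effective": rate(effective)}


# Illustrative (made-up) school: all 6 effective teachers stay,
# while 3 of the 4 teachers rated ineffective leave.
school = (
    [{"effective": True, "retained": True}] * 6
    + [{"effective": False, "retained": True}] * 1
    + [{"effective": False, "retained": False}] * 3
)
rates = retention_rates(school)
```

In this made-up school, overall retention is 70%, which might look like a problem, while retention of effective teachers is 100%. The second figure is the one that speaks to the principal's use of retention as a lever.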

Unfortunately, it can be difficult to examine these and other outcomes because of constraints in 
data availability and resources. Despite their potential to inform district decision making and
research on best practices, high-quality data that would allow us to look at these outcomes are 
rarely available within districts. 

Recommendation: Program staff, districts, evaluators, and policymakers should understand the 
strengths, challenges, and availability of data related to any potential outcomes of interest (see, 
for example, Table 1). Whenever possible, multiple measures should be used during an
evaluation. 17 If only one outcome can be used because of data limitations, careful consideration 
should be taken in data collection and analysis. Contextual factors should be described to the 
extent possible. 


17 UCEA and New Leaders have recently developed a toolkit that might be considered when determining 
outcomes of focus. See http://www.sepkit.org/publications/.
Table 1. Outcomes of Interest in Evaluating the Impact of Principal Preparation Programs

Outcome of Interest: Student achievement (test scores)

Strengths:
■ Consistently available across schools and districts, easily comparable

Challenges:
■ Typically available only for Grades 3-8 and one grade in high school.
■ Tests and cut scores change across time, although this can be controlled for to some extent.
■ An indirect outcome of principal practice: principals account for a small amount of
variation in student test scores.

Outcome of Interest: School climate

Strengths:
■ Directly related to principal practice
■ Associated with improved student achievement

Challenges:
■ Data are not consistently available or of sufficient quality to use in analysis.
■ Where school climate data were available, they included only student responses. Arguably,
teacher responses are more critical to understanding climate changes in a school.

Outcome of Interest: Graduation rates

Strengths:
■ A measure of student achievement at the secondary level
■ Consistently available across schools and districts, easily comparable

Challenges:
■ Available only for secondary schools.
■ Large changes in graduation rates may be difficult to achieve in the short term (and
therefore harder to detect principal impact).
■ States do change the requirements for graduation, which can affect long-term analysis.

Outcome of Interest: Teacher retention

Strengths:
■ Directly related to principal practice
■ A lever that principals might use to impact student outcomes

Challenges:
■ Would need to combine information on retention with valid information on teacher
effectiveness that is able to capture variation in teacher quality and performance to
interpret findings.

Outcome of Interest: Principal practice measures

Strengths:
■ Directly related to principal practice, and a primary focus of preparation programs

Challenges:
■ Few validated measures of principal practice exist.

Outcome of Interest: Teacher practice measures

Strengths:
■ Directly related to principal practice and a primary focus of preparation programs

Challenges:
■ Measures of teacher practice are still being developed.


In addition, districts should collect and analyze data that help them understand where their best
leaders are trained, when a leader may need additional support and resources, and how to keep 
the best leaders in a district. This may mean allocating greater resources toward data-collection 
efforts. Preparation programs and districts should work together to develop data-sharing 
agreements that allow them to work together for improvement. 18


18 For more information on data availability, see the brief titled What Districts Know— and Need to Know — 
About Their Principals (George W. Bush Institute, 2016). 
Challenge: Principals affect student achievement indirectly.

Principals are responsible for setting the conditions for learning in their schools (e.g., supporting 
teaching practice and establishing a school climate conducive to learning), but they rarely provide 
direct instruction to students. Nonetheless, it seems logical that well-prepared principals ought 
to be able to affect achievement more than principals who are less well prepared, perhaps not in
their first year but in the long term. An evaluation can show statistically significant impacts on
student achievement by principals, but it should be noted that these gains are expected to be 
smaller than teacher impacts because of the indirect effect of principals. 19 

Recommendation: One way to address this issue is to carefully select programs that appear to 
embrace many of the practices recommended by experts to prepare principals well. 20 One might 
theorize that graduates of these programs would be more likely to affect student achievement 
and other outcomes compared with peers from other programs. 

If programs are identified whose graduates are having systematic positive impacts on student 
achievement, then the field could work to learn more about these programs' practices. It should 
be noted, however, that the results would not explain how or why that program was effective. 
Additional research would need to be done to determine specific program practice effects. 

Challenge: Sample sizes of principals from programs may be small, which 
can make the analysis and interpretation of findings difficult. 

The number of principals from any given program may be small for a variety of reasons. First, 
many programs simply do not graduate a very large number of principals in any given year. So 
even with a long panel of data, it can be difficult to identify a large number of principals from 
any single program. 

Further, most programs in this study had no or very few principals assigned to high schools, 
making the analysis of high school outcomes difficult. Also, principal turnover can reduce sample 
size. This turnover can make it difficult to evaluate the long-term effects of principal graduates. 21 

Finally, an evaluation methodology itself might restrict the number of principals eligible for 
inclusion. For example, in the Bush Institute study, one of the methods used required that schools 
have 3 years of achievement data prior to a principal's placement to be included in the study. 
These prior data establish a trend line for the school for the time before the new principal entered, 
which is an important element to fairly examine what happened after the principal was placed. 
However, this approach meant that some schools and principals were excluded (including 


19 Gates et al. (2014). 

20 For example, see Cheney et al. (2010); Darling-Hammond et al. (2007); and George W. Bush Institute 
(2013). 

21 The Bush Institute study includes a descriptive analysis of attrition among principals from selected and 
other programs. 
schools where a program principal had been placed, then left, and was replaced by a different
principal from the same program). 
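
An inclusion rule of this kind can be written down explicitly, which makes it easier to see how many schools it excludes. The following Python sketch is illustrative only: the function name `is_eligible` and its parameters are hypothetical, and it assumes that the required 3 years are the consecutive years immediately before placement, a detail the description above does not specify.

```python
def is_eligible(years_with_data, placement_year, required_prior_years=3):
    """Check whether a school has achievement data for each of the
    `required_prior_years` years immediately before `placement_year`.

    `years_with_data` is a set of school years (as integers) for which
    achievement data exist. Assumption: the prior years must be consecutive.
    """
    needed = range(placement_year - required_prior_years, placement_year)
    return all(year in years_with_data for year in needed)


# A school with data for 2009-2012 and a principal placed in 2012 qualifies;
# a school whose data begin in 2010 does not.
full_history = is_eligible({2009, 2010, 2011, 2012}, placement_year=2012)
short_history = is_eligible({2010, 2011, 2012}, placement_year=2012)
```

Counting how many schools fail such a check before running the analysis gives an early indication of how much the rule will shrink the sample.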

Recommendation: It is important that evaluators, researchers, and others understand that small 
sample sizes are a concern with all research on principals because there are many fewer principals 
than teachers and students. In addition, when focusing on a subset of principals, such as new 
principals, the sample size is even smaller. 

When developing a methodology for evaluation, a firm set of decision rules aimed at balancing a 
larger sample size with data quality should be set. Multiple analyses or methods can be used to 
triangulate findings. In the Bush Institute study, a second analysis was performed that allowed 
the inclusion of additional principals to account for this. 

Challenge: It may take a principal time to affect outcomes at a school. 

Across time, principals may become more effective at their jobs. 22 Indeed, the length of a 
principal's tenure at a school may itself be related to the principal's effectiveness. Perhaps the
main reason that improving student learning takes time is that the most important lever for
doing so is improving teacher practice, which itself takes time. 23 Some research indicates that
it may take as many as 3 years for principals to make a difference in student achievement. 24 

Student achievement may even take a "dip" after a new principal enters a school, perhaps because 
a new principal is learning on the job or for other reasons. These factors may point to the benefit 
of evaluating the impact of principals after several years of their placement. At the same time, 
however, long-term tracking of principals within schools can be challenging. Principal turnover 
rates and mobility are high, particularly in underserved districts where some preparation 
programs intentionally place principals as part of their mission. 

Recommendation: One solution is to focus on a principal's first 3 years on the job. Methodologically, 
the Bush Institute study dealt with the fact that schools may experience a dip in achievement when a 
new principal arrives by looking at the relative change in achievement compared with other similar 
schools with newly placed principals. So, even if achievement went down, the question is, "Did it go 
down more on average than in other schools with newly placed principals?" 

While looking at the first 3 years is preferable to only looking at the first year a principal is placed, 
longer term evaluations are recommended to understand the full long-term impacts of a 
principal. Greater resources may need to be allocated for evaluation work so that programs, 
districts, and others can get a better understanding of how principals are affecting student 
outcomes. 


22 Beteille, Kalogrides, and Loeb (2011); Clark, Martorell, and Rockoff (2009); Coelli and Green (2012); and 
Seashore Louis, Leithwood, Wahlstrom, and Anderson (2010). 

23 See, for example, Wiliam (2016). 

24 Corcoran et al. (2012). 
Challenge: Estimates of principal program effects on student achievement
are difficult to disentangle from many other factors, such as selection into 
programs 25 or supports provided by districts. 

An additional challenge in evaluating principals' effectiveness across a longer period as they gain 
experience is that it is possible that any knowledge and skills principals gained in their programs 
could fade or become entangled with on-the-job learning, principal supervision and evaluation 
processes, or school- or district-provided supports. The graduate interviews that were a part of
this study revealed variability in how graduates perceived the value of district supports.

Recommendation: Attributing changes in student achievement to principal preparation 
programs means using methods that can best support that kind of causal claim, while attempting 
to control for as many other factors as possible. One method that might be considered is a 
comparative interrupted time series (CITS) approach, which was one of the methods used in the 
Bush Institute study. 

The CITS approach is a quasi-experimental design method recognized by the What Works 
Clearinghouse and is a rigorous approach to evaluation. It treats a new principal's entry from a 
particular program into a school as a school-level intervention that we can compare to other 
schools that also received new principals at the same time but from other programs (see Figure 1). 


Figure 1. Comparative Interrupted Time Series Approach 



The CITS approach uses 3 years of data prior to a new principal's arrival to establish trends for 
schools and then compares changes in achievement in schools where a principal from a program 
of interest was placed to changes in other schools that also received newly placed principals. In 
this manner, schools can essentially be compared with their own baseline. 
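
The logic of the comparison can be sketched in a few lines of Python. This is a simplified illustration, not the model used in the Bush Institute study: it fits a straight-line trend to each school's pre-entry years, measures how far post-entry scores fall from that projected trend, and then compares the average deviation in program schools with the average deviation in comparison schools. All function names, the data layout, and the example scores are hypothetical, and a real analysis would typically use a regression model with controls and standard errors.

```python
from statistics import mean


def fit_linear_trend(years, scores):
    """Ordinary least-squares line through (year, score) points."""
    year_bar, score_bar = mean(years), mean(scores)
    sxx = sum((y - year_bar) ** 2 for y in years)
    sxy = sum((y - year_bar) * (s - score_bar) for y, s in zip(years, scores))
    slope = sxy / sxx
    intercept = score_bar - slope * year_bar
    return slope, intercept


def deviation_from_baseline(pre, post):
    """Mean gap between observed post-entry scores and the projected pre-entry trend.

    `pre` and `post` map relative year -> school mean score. Negative years
    precede the new principal's arrival; year 0 is the first year in the role.
    """
    pre_years = sorted(pre)
    slope, intercept = fit_linear_trend(pre_years, [pre[y] for y in pre_years])
    return mean(post[y] - (intercept + slope * y) for y in post)


def cits_estimate(program_schools, comparison_schools):
    """Average deviation in program schools minus that in comparison schools."""
    program = mean(deviation_from_baseline(s["pre"], s["post"]) for s in program_schools)
    comparison = mean(deviation_from_baseline(s["pre"], s["post"]) for s in comparison_schools)
    return program - comparison


# Made-up scores: one program school and one comparison school,
# each with 3 pre-entry years and 3 post-entry years.
program = [{"pre": {-3: 50, -2: 51, -1: 52}, "post": {0: 52, 1: 54, 2: 56}}]
comparison = [{"pre": {-3: 60, -2: 60, -1: 60}, "post": {0: 59, 1: 59, 2: 59}}]
estimate = cits_estimate(program, comparison)
```

With these made-up numbers, the program school ends up on its projected trend on average, the comparison school falls 1 point below its own trend, and the estimate is 1.0 point in favor of the program school. A first-year dip that appears in both groups would cancel out of the difference, which is the property the approach relies on.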


25 Of course, one also could argue that selection is an element of program quality, although it is true that 
the pool of possible candidates is not necessarily the same across all programs because of geography and 
other factors. 
One limitation of the Bush Institute study is that data shortcomings prevented a complete
understanding of the comparison principals in some districts. For example, some districts were 
unable to provide data on a principal's experience before he or she entered the district or where 
the person was trained. It also would have been useful to be able to track professional 
development and supports received by all principals in a district. The CITS approach still allows 
an evaluator to look at how individual principals affected outcomes compared with a school's 
baseline, but the results should be interpreted with this in mind. 

In addition, qualitative research should accompany a quantitative evaluation to understand the 
types of supports from districts and principals themselves. The following topics might be 
considered when designing interview questions: 

• Induction support for new principals 

• Coaching and mentoring 

• Principal professional development 

• Grant-funded programs and initiatives targeted at improved student outcomes 

Although it may be impossible to control for all additional "noise" in an evaluation, adding a 
qualitative component can be helpful in determining what other initiatives are happening during 
a principal's tenure to provide helpful context. 

Conclusion and Recommendations 


Evaluating the impact of principal preparation programs is essential to improve programming, 
inform policy, and provide information to consumers. Although challenges were experienced 
during this process, preparation programs need to know if their graduates are moving the needle 
on student and school outcomes to inform continuous improvement, and districts need 
information about which programs are producing the best principals to inform critical pipeline 
decisions. 

Although the information produced by an impact evaluation may have limitations in what it can 
tell us about best practices in principal preparation program design, this information should be 
considered as one aspect of preparation program improvement and accountability. The study 
team recommends the following: 

• Without consistent, timely, and comparable data, neither districts nor researchers can 
track how much training a principal has received; whether graduates of different 
preparation programs show differences in retention, performance, or other outcomes; or 
how district-provided programs may help principals achieve their goals. Because of this, 
the following is recommended: 

Accurate and comprehensive data collection systems and analysis measures are 
needed to improve both principal preparation and principal performance. 26


26 Principal performance evaluation accuracy and reliability should be improved so that postgraduation 
job performance can be used as a preparation program quality measure. 
States and districts should collect data that are more systematic on outcomes in
addition to student standardized test scores (e.g., school climate, teacher
workforce data, and principal workforce data).

• It is possible to estimate program effects in a manner that is methodologically rigorous 
and fair in terms of controlling for factors that are associated with student outcomes. 
However, there are important limitations to these data and how they should be used. 
Because of this, the following is recommended: 

Impact evaluations using student outcomes are necessary and critical for 
program improvement and district awareness. 

A full evaluation of principal preparation programs should include multiple 
measures, and focusing on measures of individual principal effectiveness in 
addition to overall averages may be very informative. 

• Research is clear that principals are critical to school effectiveness, but research on many 
issues related to preparing and supporting great principals is just emerging. In addition, 
preparation alone will not solve the issue of ensuring that every school has a great 
principal. Because of this, the following is recommended: 

More sustained research is needed on issues of developing and retaining our 
most effective principals. 

Education leadership also can learn from the body of leadership research in 
other fields. 

Districts and policymakers should consider how preparation fits into a 
continuum of development and supports along principals' career pathways and 
work to improve all aspects of principal talent management. 27


27 See the Principal Talent Management Framework, which presents suggested policies and practices in 
the areas of preparation, recruitment and selection, professional learning, evaluation, compensation and 
incentives, and working environment. 